Notes/Domino Fix List
SPR # TPAE6K5JZG | Fixed in 6.5.6 release



Product Area: Client
Technical Area: Instant Messaging
Platform: Cross Platform
Lotus Customer Support APAR: LO11737

SPR# TPAE6K5JZG - Certain rare combinations of user actions can put the Notes client in a state where it sends a continuous stream of name resolve requests, potentially enough requests rapidly enough to overwhelm the Sametime server and cause it to crash. One symptom of this problem is the nlnotes process on the Notes client spiking to 100% CPU. Logging off from the Notes client instant messaging returns the CPU usage of the nlnotes process to normal. The best indication of this problem is one (or more) Notes clients spiking to 100% CPU usage and the ST server(s) slowing to a crawl at the same time.

Technote Number: 1230709

Problem:
Under very specific circumstances the Sametime® server can receive incoming
requests at an extremely high rate from the Notes® client. These incoming
requests must be resolved in order for instant messaging users to communicate
and share presence information. As a result of receiving these requests at an
extremely high rate, the Sametime server can become unresponsive as it consumes
system resources during the processing of these incoming messages. The user
will receive an error message indicating that they have been logged out of
Sametime, with no indication as to why the user was disconnected. Disconnected
users will be able to immediately reconnect to the Sametime community.
The Sametime server's state of unresponsiveness may manifest itself as
out-of-memory errors or by disconnecting from the Sametime Mux (which is used
to route instant messages).
Symptoms of this problem can include:
- The nlnotes process on the Notes client spikes to 100% CPU. Logging off from
  Notes client IM returns the CPU usage of the nlnotes process to normal.
- CPU usage of multiple processes on the Sametime server is pegged at
  above-normal rates that may eventually reach 100%.
- Attempting to open a chat session with another client shows the message
  "Initializing chat: Resolving User Name". This is an organization-wide outage.
- Adding someone to a contact list takes an inordinate amount of time.
- On iSeries, this problem appears as an abnormal termination of the StMux
  task. A CPU spike would probably go unnoticed because the CPU has so much
  processing power available.
- The ST Resolve process can also crash.
In addition to the Cumulative Client Hotfix (CCH), which has been released for
Notes 6.5.5 and 7.0 clients (see technote #1206369), a server-side defensive
workaround has been developed for Sametime 6.5.1 FP1 and 7.0 and integrated
into Sametime 7.5. This workaround detects Notes clients that enter the
looping state and disconnects them from the Sametime community. The user will
not receive an error message stating why they were disconnected. Disconnected
users will be able to immediately reconnect to the Sametime community.
To determine the clients that exhibit the looping behavior, the STMux Server
Application counts incoming messages sent from a client. If the count exceeds
the predefined threshold, the STMux will disconnect the client and an error
will be recorded in the Sametime Log (STLog.nsf).
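The counting described above can be pictured with a minimal Python sketch. It is not the STMux implementation: the class name, the fixed-window strategy, and the assumption that the interval is measured in seconds are all illustrative, since this technote does not describe how the STMux counts internally.

```python
from collections import defaultdict


class ClientThresholdMonitor:
    """Illustrative fixed-window message counter (hypothetical, not STMux code)."""

    def __init__(self, interval_seconds=10, max_messages=10000):
        # Assumption: the interval is in seconds; the technote gives no unit.
        self.interval_seconds = interval_seconds
        self.max_messages = max_messages
        # client id -> [window start time, messages counted in that window]
        self._windows = defaultdict(lambda: [None, 0])

    def record_message(self, client_id, now):
        """Count one incoming message; return True if the client exceeds the threshold."""
        window = self._windows[client_id]
        if window[0] is None or now - window[0] >= self.interval_seconds:
            # Start a new counting window for this client.
            window[0] = now
            window[1] = 0
        window[1] += 1
        return window[1] > self.max_messages
```

A caller would invoke record_message() for every incoming message and disconnect the client (and log the event) the first time it returns True. Each client is counted separately, so one looping client does not affect the counts of others.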
The threshold is configured using the following sametime.ini flags. (Please do
not change these values; they are our recommended settings for detecting a
client attack correctly.)
[Config]
VPMX_THRESHOLD_INTERVAL=10
VPMX_THRESHOLD_MSG_COUNT=10000
VPMX_THRESHOLD_SEND_MSG_COUNT=5000
VPMX_THRESHOLD_CREATE_MSG_COUNT=5000
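Because these values should not be changed, it can be useful to confirm that a server's sametime.ini still carries them. The sketch below uses Python's standard configparser module and assumes the file parses as a conventional INI file; the function name check_threshold_flags is hypothetical and is not part of any Sametime tooling.

```python
import configparser

# Recommended values as listed in this technote.
RECOMMENDED = {
    "VPMX_THRESHOLD_INTERVAL": "10",
    "VPMX_THRESHOLD_MSG_COUNT": "10000",
    "VPMX_THRESHOLD_SEND_MSG_COUNT": "5000",
    "VPMX_THRESHOLD_CREATE_MSG_COUNT": "5000",
}


def check_threshold_flags(ini_text):
    """Return {flag: actual value or None} for flags that differ from RECOMMENDED."""
    parser = configparser.ConfigParser(
        strict=False, interpolation=None, allow_no_value=True
    )
    parser.optionxform = str  # keep flag names case-sensitive
    parser.read_string(ini_text)
    problems = {}
    for name, expected in RECOMMENDED.items():
        # fallback=None also covers a missing [Config] section.
        actual = parser.get("Config", name, fallback=None)
        if actual != expected:
            problems[name] = actual
    return problems
```

An empty result means all four flags are present under [Config] with the recommended values. A flag that is missing is reported with the value None.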
To get relevant traces, add the following flags to sametime.ini under the
[Debug] section:
VPS_AUTH_DEBUG=1
VP_LDAP_TRACE=1
VP_REG_TRACE=1
VPS_DEBUG_CHANNEL_MSG=1
VPS_DEBUG_CONFIG=1
VPS_DEBUG_COP=1
VPS_DEBUG_GATEWAY_MSG=1
VPS_DEBUG_LOGIN_MSG=1
VPS_DEBUG_OTM_MSG=1
VPS_DEBUG_SERVICE_MSG=1
VPS_DEBUG_STATS_MSG=1
VPS_DEBUG_USER_MSG=1
VPMX_CLIENT_THRESHOLDS_DEBUG=1
UCM_DEBUG=1
UCM_KERNEL=1
UCM_SELECT=1
UCM_NOTIFY=1
UCM_MESSAGES=1
VPHMX_PURE_HTTP_DEBUG=1
VPMX_TCP_DEBUG=1
VPMX_CNL_DEBUG=1
VPMX_MSG_DEBUG=1
VPMX_HTTP_DEBUG=1
VPMX_DEBUG=1
VPMX_ROUTING_DEBUG=1
These debug flags record trace information into the STMux_*.txt file located in
the Domino Trace (default: \lotus\domino\trace) directory. Any client that is
disconnected will see an error code equal to 80000233. The Sametime Log
(stlog.nsf) will also record the disconnect with the reason "Client exceeds
threshold".
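When reviewing the traces, a small helper can collect matching lines from every STMux_*.txt file in the trace directory. The following Python sketch is only an illustration: the function name find_trace_lines is hypothetical, and this technote does not document the trace line format, so the search string is left to the caller.

```python
from pathlib import Path


def find_trace_lines(trace_dir, marker):
    """Yield (file name, line number, line) for STMux_*.txt lines containing marker."""
    for path in sorted(Path(trace_dir).glob("STMux_*.txt")):
        # errors="replace" avoids failing on bytes that are not valid text.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    yield path.name, number, line.rstrip("\n")
```

For example, list(find_trace_lines(r"\lotus\domino\trace", "threshold")) would return every trace line containing the word "threshold". That search term is an assumption, not a documented trace message.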
To obtain the server-side defensive fix, please open a PMR with IBM Technical
Support.
A client-side fix has also been put into Notes 6.5.6 and Notes 7.0.2. Refer to
the Upgrade Central site for details on upgrading Notes/Domino.
Excerpt from the Lotus Notes and Domino Release 6.5.6 MR fix list (available at
http://www.ibm.com/developerworks/lotus):
SPR# TPAE6K5JZG - Certain rare combinations of user actions can put the Notes
client in a state where it sends a continuous stream of name resolve requests,
potentially enough requests rapidly enough to overwhelm the Sametime server and
cause it to crash. One symptom of this problem is if the nlnotes process on
the Notes client spikes to 100%. Logging off from the Notes client instant
messaging returns the CPU on the Nlnotes process to normal. The best
indication of this problem is one (or more) Notes clients spiking to 100% CPU
usage and the ST server(s) slowing to a crawl at the same time.


Last Modified on 12/10/2013